 Construction Machinery & Heavy Trucks


Development of CPS Platform for Autonomous Construction

arXiv.org Artificial Intelligence

In recent years, labor shortages due to the declining birthrate and aging population have become significant challenges at construction sites in developed countries, including Japan. To address these challenges, we are developing an open platform called ROS2-TMS for Construction, a Cyber-Physical System (CPS) for construction sites, to achieve both efficiency and safety in earthwork operations. In ROS2-TMS for Construction, the system comprehensively collects and stores environmental information from sensors placed throughout the construction site. Based on these data, a real-time virtual construction site is created in cyberspace. Then, based on the state of construction machinery and environmental conditions in cyberspace, the optimal next actions for actual construction machinery are determined, and the construction machinery is operated accordingly. In this project, we decided to use the Open Platform for Earthwork with Robotics and Autonomy (OPERA), developed by the Public Works Research Institute (PWRI) in Japan, to control construction machinery from ROS2-TMS for Construction with an originally extended behavior tree. In this study, we present an overview of OPERA, focusing on the newly developed navigation package for operating the crawler dump, as well as the overall structure of ROS2-TMS for Construction as a Cyber-Physical System (CPS). Additionally, we conducted experiments using a crawler dump and a backhoe to verify the aforementioned functionalities.


Federated Learning framework for LoRaWAN-enabled IIoT communication: A case study

arXiv.org Artificial Intelligence

The development of intelligent Industrial Internet of Things (IIoT) systems promises to revolutionize operational and maintenance practices, driving improvements in operational efficiency. Anomaly detection within IIoT architectures plays a crucial role in preventive maintenance and spotting irregularities in industrial components. However, due to limited message and processing capacity, traditional Machine Learning (ML) faces challenges in deploying anomaly detection models in resource-constrained environments like LoRaWAN. Federated Learning (FL), on the other hand, addresses this problem by enabling distributed model training, addressing privacy concerns, and minimizing data transmission. This study explores using FL for anomaly detection in industrial and civil construction machinery architectures that use IIoT prototypes with LoRaWAN communication. The process leverages an optimized autoencoder neural network structure and compares federated models with centralized ones. Despite uneven data distribution among machine clients, FL demonstrates effectiveness, with a mean F1 score of 94.77, accuracy of 92.30, TNR of 90.65, and TPR of 92.93, comparable to centralized models, at a training-message airtime of 52.8 min. Local model evaluations on each machine highlight adaptability. At the same time, the performed analysis identifies message requirements, minimum training hours, and optimal round/epoch configurations for FL in LoRaWAN, guiding future implementations in constrained industrial environments.
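As a rough illustration of the distributed training described above, the sketch below shows one server-side aggregation step. The abstract does not say which aggregation rule the study uses; sample-size-weighted averaging (FedAvg-style) is assumed here, and the function name and flat parameter layout are hypothetical.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Combine client parameter vectors into one global vector.

    Assumption: each client sends a flat list of model parameters plus the
    number of local training samples; the server weights each client by its
    share of the total sample count (FedAvg-style averaging).
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must send vectors of equal length")
        share = n_samples / total  # uneven data shows up as uneven shares
        for i, w in enumerate(weights):
            averaged[i] += share * w
    return averaged
```

With uneven data distribution among machines, a client holding three times as many samples pulls the global parameters three times as strongly, which is one reason local evaluation on each machine is still worth doing.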


Language Supervised Human Action Recognition with Salient Fusion: Construction Worker Action Recognition as a Use Case

arXiv.org Artificial Intelligence

Detecting human actions is a crucial task for autonomous robots and vehicles, often requiring the integration of various data modalities for improved accuracy. In this study, we introduce a novel approach to Human Action Recognition (HAR) based on skeleton and visual cues. Our method leverages a language model to guide the feature extraction process in the skeleton encoder. Specifically, we employ learnable prompts for the language model conditioned on the skeleton modality to optimize feature representation. Furthermore, we propose a fusion mechanism that combines dual-modality features using a salient fusion module, incorporating attention and transformer mechanisms to address the modalities' high dimensionality. This fusion process prioritizes informative video frames and body joints, enhancing the recognition accuracy of human actions. Additionally, we introduce a new dataset, named VolvoConstAct, tailored for real-world robotic applications on construction sites and featuring visual, skeleton, and depth data modalities. This dataset facilitates the training and evaluation of machine learning models that instruct autonomous construction machines to perform necessary tasks in real-world construction zones. To evaluate our approach, we conduct experiments on our dataset as well as three widely used public datasets: NTU-RGB+D, NTU-RGB+D120, and NW-UCLA. Results reveal that our proposed method achieves promising performance across all datasets, demonstrating its robustness and potential for various applications. The code and dataset are available at: https://mmahdavian.github.io/ls_har/


Feedforward Controllers from Learned Dynamic Local Model Networks with Application to Excavator Assistance Functions

arXiv.org Artificial Intelligence

Complicated first principles modelling and controller synthesis can be prohibitively slow and expensive for high-mix, low-volume products such as hydraulic excavators. Instead, in a data-driven approach, recorded trajectories from the real system can be used to train local model networks (LMNs), for which feedforward controllers are derived via feedback linearization. However, previous works required LMNs without zero dynamics for feedback linearization, which restricts the model structure and thus modelling capacity of LMNs. In this paper, we overcome this restriction by providing a criterion for when feedback linearization of LMNs with zero dynamics yields a valid controller. As a criterion we propose the bounded-input bounded-output stability of the resulting controller. In two additional contributions, we extend this approach to consider measured disturbance signals and multiple inputs and outputs. We illustrate the effectiveness of our contributions in a hydraulic excavator control application with hardware experiments. To this end, we train LMNs from recorded, noisy data and derive feedforward controllers used as part of a leveling assistance system on the excavator. In our experiments, incorporating disturbance signals and multiple inputs and outputs enhances tracking performance of the learned controller. A video of our experiments is available at https://youtu.be/lrrWBx2ASaE.


DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training

arXiv.org Artificial Intelligence

Swift and accurate detection of specified objects is crucial for many industrial applications, such as safety monitoring on construction sites. However, traditional approaches rely heavily on arduous manual annotation and data collection, which struggle to adapt to ever-changing environments and novel target objects. To address these limitations, this paper presents DART, an automated end-to-end pipeline designed to streamline the entire workflow of an object detection application from data collection to model deployment. DART eliminates the need for human labeling and extensive data collection while excelling in diverse scenarios. It employs a subject-driven image generation module (DreamBooth with SDXL) for data diversification, followed by an annotation stage where open-vocabulary object detection (Grounding DINO) generates bounding box annotations for both generated and original images. These pseudo-labels are then reviewed by a large multimodal model (GPT-4o) to guarantee credibility before serving as ground truth to train real-time object detectors (YOLO). We apply DART to a self-collected dataset of construction machines named Liebherr Product, which contains over 15K high-quality images across 23 categories. The current implementation of DART significantly increases average precision (AP) from 0.064 to 0.832. Furthermore, we adopt a modular design for DART to ensure easy exchangeability and extensibility. This allows for a smooth transition to more advanced algorithms in the future, seamless integration of new object categories without manual labeling, and adaptability to customized environments without extra data collection. The code and dataset are released at https://github.com/chen-xin-94/DART.


Design and Simulation of Time-energy Optimal Anti-swing Trajectory Planner for Autonomous Tower Cranes

arXiv.org Artificial Intelligence

For autonomous crane lifting, optimal trajectories of the crane are required as reference inputs to the crane controller to facilitate feedforward control. Reducing the unactuated payload motion is a crucial issue for under-actuated tower cranes with spherical pendulum dynamics. The planned trajectory should be optimal in terms of both operating time and energy consumption, so that the best output is achieved with the least effort. This article proposes an anti-swing tower crane trajectory planner that can provide time-energy optimal solutions for the Computer-Aided Lift Planning (CALP) system developed at Nanyang Technological University, which facilitates collision-free lifting path planning of robotized tower cranes in autonomous construction sites. The current work introduces a trajectory planning module to the system that utilizes the geometric outputs from the path planning module and optimally scales them with time information. First, by analyzing the nonlinear dynamics of the crane operations, the tower crane is shown to be differentially flat. Subsequently, the multi-objective trajectory optimization problems for all the crane operations are formulated in the flat output space, taking the mechanical and safety constraints into account. Two multi-objective evolutionary algorithms, namely Non-dominated Sorting Genetic Algorithm (NSGA-II) and Generalized Differential Evolution 3 (GDE3), are extensively compared via statistical measures based on the closeness of solutions to the Pareto front, the distribution of solutions in the solution space, and the runtime, to select the optimization engine of the planner. Finally, the crane operation trajectories are obtained via the corresponding planned flat output trajectories. Studies simulating real-world lifting scenarios are conducted to verify the effectiveness and reliability of the proposed module of the lift planning system.
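To make the time-energy trade-off concrete, the sketch below filters a set of candidate trajectories down to the non-dominated ones, which is the notion of Pareto optimality that multi-objective optimizers such as NSGA-II and GDE3 search for. It is not either algorithm, only the dominance test they rely on; the (time, energy) tuples are made-up illustrative values and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, float]  # (operating time, energy), both minimized


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in every objective and better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: Sequence[Objectives]) -> List[Objectives]:
    """Keep only candidates that no other candidate dominates."""
    return [
        p for p in candidates
        if not any(dominates(q, p) for q in candidates)
    ]


# Illustrative values only: (seconds, kilojoules) for five candidate lifts.
candidates = [(10.0, 5.0), (8.0, 7.0), (12.0, 4.0), (9.0, 9.0), (10.0, 6.0)]
front = pareto_front(candidates)
```

Here `(9.0, 9.0)` is dropped because `(8.0, 7.0)` is both faster and cheaper, and `(10.0, 6.0)` is dropped because `(10.0, 5.0)` takes the same time with less energy. The remaining three are genuine trade-offs between time and energy.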


Estimation of articulated angle in six-wheeled dump trucks using multiple GNSS receivers for autonomous driving

arXiv.org Artificial Intelligence

Due to the declining birthrate and aging population, the shortage of labor in the construction industry has become a serious problem, and increasing attention has been paid to automation of construction equipment. We focus on the automatic operation of articulated six-wheel dump trucks at construction sites. For the automatic operation of the dump trucks, it is important to estimate the position and the articulation angle of the dump trucks with high accuracy. In this study, we propose a method for estimating the state of a dump truck by using four global navigation satellite system (GNSS) receivers installed on an articulated dump truck and a graph optimization method that utilizes the redundancy of multiple GNSS receivers. By adding real-time kinematic (RTK)-GNSS constraints and geometric constraints between the four antennas, the proposed method can robustly estimate the position and articulation angle even in environments where GNSS satellites are partially blocked. Field tests confirmed that the articulation angle could be estimated with an accuracy of 0.1° in an open-sky environment and 0.7° in a mountainous area simulating an elevation angle of 45° where GNSS satellites are blocked.
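For intuition about what is being estimated, the sketch below computes an articulation angle directly from antenna positions. This is a naive geometric baseline, not the graph optimization method of the paper. It assumes, hypothetically, that two antennas sit on the front body and two on the rear body, and that their positions are already available in a local planar east/north frame in metres.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (east, north) in metres, local planar frame


def heading(tail: Point, nose: Point) -> float:
    """Heading in radians of the line from tail antenna to nose antenna."""
    return math.atan2(nose[1] - tail[1], nose[0] - tail[0])


def articulation_angle_deg(front_tail: Point, front_nose: Point,
                           rear_tail: Point, rear_nose: Point) -> float:
    """Signed angle of the front body relative to the rear body, in degrees.

    Assumed antenna layout (not stated in the abstract): one tail/nose pair
    per body, each pair aligned with that body's longitudinal axis.
    """
    diff = heading(front_tail, front_nose) - heading(rear_tail, rear_nose)
    # Wrap into (-180, 180] so headings near +/-180 degrees do not jump.
    wrapped = math.atan2(math.sin(diff), math.cos(diff))
    return math.degrees(wrapped)
```

A direct computation like this fails as soon as one antenna loses its fix, which is the situation the redundancy and geometric constraints in the abstract are meant to handle.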


GREEMA: Proposal and Experimental Verification of Growing Robot by Eating Environmental MAterial for Landslide Disaster

arXiv.org Artificial Intelligence

In areas that are inaccessible to humans, such as the lunar surface and landslide sites, there is a need for multiple autonomous mobile robot systems that can replace human workers. In particular, at landslide sites such as river channel blockages, robots are required to remove water and sediment from the site as soon as possible. Conventionally, several construction machines have been deployed to the site for civil engineering work. However, because of the large size and weight of conventional construction equipment, it is difficult to move multiple units of construction equipment to the site, resulting in significant transportation costs and time. To solve such problems, this study proposes a novel robot, called GREEMA, that grows by eating environmental material: it is lightweight and compact during transportation, but functions by eating environmental materials once it arrives at the site. GREEMA actively takes in environmental materials such as water and sediment, uses them as its structure, and removes them by moving itself. In this paper, we developed and experimentally verified two types of GREEMAs. First, we developed a fin-type swimming robot that passively takes water into its body using a water-absorbing polymer and forms a body to express its swimming function. Second, we constructed an arm-type robot that eats soil to increase the rigidity of its body. We discuss the results of these two experiments from the viewpoint of Explicit-Implicit control and describe the design theory of GREEMA.


Do we need scan-matching in radar odometry?

arXiv.org Artificial Intelligence

There is a current increase in the development of "4D" Doppler-capable radar and lidar range sensors that produce 3D point clouds where all points also have information about the radial velocity relative to the sensor. 4D radars in particular are interesting for object perception and navigation in low-visibility conditions (dust, smoke) where lidars and cameras typically fail. With the advent of high-resolution Doppler-capable radars comes the possibility of estimating odometry from single point clouds, foregoing the need for scan registration, which is error-prone in feature-sparse field environments. We compare several odometry estimation methods, from direct integration of Doppler/IMU data and Kalman filter sensor fusion to 3D scan-to-scan and scan-to-map registration, on three datasets with data from two recent 4D radars and two IMUs. Surprisingly, our results show that the odometry from Doppler and IMU data alone gives similar or better results than 3D point cloud registration. In our experiments, the average position error can be as low as 0.3% over 1.8 km and 4.5 km trajectories. That allows accurate estimation of 6DOF ego-motion over long distances, even in feature-sparse mine environments. These results are useful not least for applications of navigation with resource-constrained robot platforms in feature-sparse and low-visibility conditions such as mining, construction, and search & rescue operations.
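The idea of estimating motion from a single Doppler point cloud can be sketched as a small least-squares problem. The code below is an illustration under stated assumptions, not the estimators compared in the paper: every point is assumed static, each point supplies a unit direction `d` from the sensor and a measured radial speed `r`, and the sign convention `r = -d · v` (negative when the sensor approaches the point) is an assumption. Outlier rejection for moving objects is omitted.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def solve_3x3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("point directions do not constrain all axes")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            factor = m[row][col] / m[col][col]
            for k in range(col, 4):
                m[row][k] -= factor * m[col][k]
    x = [0.0, 0.0, 0.0]
    for row in range(2, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, 3))
        x[row] = (m[row][3] - tail) / m[row][row]
    return x


def estimate_ego_velocity(directions: Sequence[Vec3],
                          radial_speeds: Sequence[float]) -> List[float]:
    """Least-squares sensor velocity v from r_i = -d_i . v (static scene)."""
    if len(directions) != len(radial_speeds) or len(directions) < 3:
        raise ValueError("need at least three direction/speed pairs")
    dtd = [[0.0] * 3 for _ in range(3)]  # accumulates D^T D
    dtr = [0.0] * 3                      # accumulates -D^T r
    for d, r in zip(directions, radial_speeds):
        for i in range(3):
            dtr[i] -= d[i] * r
            for j in range(3):
                dtd[i][j] += d[i] * d[j]
    return solve_3x3(dtd, dtr)
```

Integrating the estimated velocity over time, with orientation supplied by an IMU, yields a position estimate without any scan registration, which is the trade-off the abstract examines.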


Data-driven models for predicting the outcome of autonomous wheel loader operations

arXiv.org Artificial Intelligence

This paper presents a method using data-driven models for selecting actions and predicting the total performance of autonomous wheel loader operations over many loading cycles in a changing environment. The performance includes loaded mass, loading time, and work. The data-driven models take the control parameters of a loading action and the heightmap of the initial pile state as input and predict either the performance or the resulting pile state. By iteratively using the resulting pile state as the initial pile state for consecutive predictions, the prediction method enables long-horizon forecasting. Deep neural networks were trained on data from over 10,000 random loading actions in gravel piles of different shapes using 3D multibody dynamics simulation. The models predict the performance and the resulting pile state with, on average, 95% accuracy in 1.2 ms, and 97% in 4.5 ms, respectively. The performance prediction can be made even faster, at some cost in accuracy, by reducing the model size with a lower-dimensional representation of the pile state based on its slope and curvature. The feasibility of long-horizon predictions was confirmed with 40 sequential loading actions at a large pile. With the aid of a physics-based model, the pile state predictions are kept sufficiently accurate for longer-horizon use.
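The iterative use of predicted pile states can be sketched as a simple rollout loop. The two model arguments below stand in for the trained networks and are hypothetical callables; the toy models after the function are placeholders for illustration only and have nothing to do with the paper's networks or data.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Pile = TypeVar("Pile")      # e.g. a heightmap
Action = TypeVar("Action")  # e.g. control parameters of one loading action


def rollout(initial_pile: Pile,
            actions: Sequence[Action],
            predict_performance: Callable[[Pile, Action], float],
            predict_next_pile: Callable[[Pile, Action], Pile],
            ) -> Tuple[List[float], Pile]:
    """Chain one-step predictions over a sequence of loading actions.

    Each predicted pile state becomes the input pile state of the next
    step, so prediction errors can accumulate over the horizon.
    """
    pile = initial_pile
    performances: List[float] = []
    for action in actions:
        performances.append(predict_performance(pile, action))
        pile = predict_next_pile(pile, action)
    return performances, pile


# Toy stand-ins: the pile is a list of column heights, an action is the
# index of the column to dig, and each dig removes half of that column.
def toy_performance(pile: List[float], column: int) -> float:
    return 0.5 * pile[column]


def toy_next_pile(pile: List[float], column: int) -> List[float]:
    updated = list(pile)
    updated[column] -= 0.5 * pile[column]
    return updated


loaded, final_pile = rollout([4.0, 2.0], [0, 0, 1],
                             toy_performance, toy_next_pile)
```

Because every step consumes the previous prediction, a periodic correction of the pile state, such as the physics-based model mentioned in the abstract, is a natural way to keep long rollouts from drifting.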